TR Discover: A Natural Language Interface for Querying and Analyzing Interlinked Datasets
Abstract
Currently, the dominant technology for providing non-technical users with access to Linked Data is keyword-based search. This is problematic because keywords are often inadequate as a means for expressing user intent. In addition, while a structured query language can provide convenient access to the information needed by advanced analytics, unstructured keyword-based search cannot meet this extremely common need. This makes it harder than necessary for non-technical users to generate analytics. We address these difficulties by developing a natural language-based system that allows non-technical users to create well-formed questions. Our system, called TR Discover, maps from a fragment of English into an intermediate First Order Logic representation, which is in turn mapped into SPARQL or SQL. The mapping from natural language to logic makes crucial use of a feature-based grammar with full formal semantics. The fragment of English covered by the natural language grammar is domain specific and tuned to the kinds of questions that the system can handle. Because users will not necessarily know what the coverage of the system is, TR Discover offers a novel auto-suggest mechanism that can help users to construct well-formed and useful natural language questions. TR Discover was developed for future use with Thomson Reuters Cortellis, which is an existing product built on top of a linked data system targeting the pharmaceutical domain. Currently, users access it via a keyword-based query interface. We report results and performance measures for TR Discover on Cortellis, and in addition, to demonstrate the portability of the system, on the QALD-4 dataset, which is associated with a public shared task. We show that the system is usable and portable, and report on the relative performance of queries using SQL and SPARQL back ends.
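To make the second stage of the pipeline concrete, here is a minimal Python sketch of how a conjunctive First Order Logic form might be rendered as either SPARQL or SQL. It is not the TR Discover implementation: the grammar-based parsing step is skipped entirely, and the `Var`, `Const`, and `Atom` classes, the `ex:` prefix, the predicate names `Drug` and `developedBy`, and the relational layout (one `id` column for unary predicates, `subj`/`obj` columns for binary ones) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    value: str


@dataclass(frozen=True)
class Atom:
    predicate: str
    args: tuple  # 1 argument = type assertion, 2 arguments = binary relation


def to_sparql(atoms, answer):
    """Render a conjunction of atoms as a SPARQL SELECT query (toy, no escaping)."""
    def term(t):
        return f"?{t.name}" if isinstance(t, Var) else f'"{t.value}"'

    patterns = []
    for atom in atoms:
        if len(atom.args) == 1:
            patterns.append(f"{term(atom.args[0])} a ex:{atom.predicate} .")
        else:
            subj, obj = atom.args
            patterns.append(f"{term(subj)} ex:{atom.predicate} {term(obj)} .")
    body = "\n  ".join(patterns)
    return (
        "PREFIX ex: <http://example.org/>\n"
        f"SELECT DISTINCT ?{answer.name} WHERE {{\n  {body}\n}}"
    )


def to_sql(atoms, answer):
    """Render the same conjunction as SQL, assuming one table per predicate."""
    tables, conditions, binding = [], [], {}
    for i, atom in enumerate(atoms):
        alias = f"t{i}"
        tables.append(f"{atom.predicate} AS {alias}")
        columns = ["id"] if len(atom.args) == 1 else ["subj", "obj"]
        for column, t in zip(columns, atom.args):
            ref = f"{alias}.{column}"
            if isinstance(t, Const):
                conditions.append(f"{ref} = '{t.value}'")  # toy: no escaping
            elif t in binding:
                conditions.append(f"{ref} = {binding[t]}")  # shared variable = join
            else:
                binding[t] = ref  # first occurrence fixes the column for this variable
    sql = f"SELECT DISTINCT {binding[answer]} FROM {', '.join(tables)}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql


# Hypothetical logical form for a question such as "drugs developed by Merck":
# Drug(x) AND developedBy(x, "Merck")
x = Var("x")
logical_form = [Atom("Drug", (x,)), Atom("developedBy", (x, Const("Merck")))]

print(to_sparql(logical_form, x))
print(to_sql(logical_form, x))
```

The point of the intermediate form in this sketch is that both back ends consume the same list of atoms, so a change of query language touches only the final rendering function.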
Similar resources
TR Discover: A Natural Language Question Answering System for Interlinked Datasets
We propose TR Discover, a question answering system that answers natural language questions over interlinked datasets. Using a feature-based grammar, TR Discover first parses a natural language question to its First Order Logic representation, which is in turn translated into SPARQL or SQL. Because users will not necessarily know what the coverage of the system is, TR Discover offers a novel au...
Querying the Web of Interlinked Datasets using VOID Descriptions
Query processing is an important way of accessing data on the Semantic Web. Today, the Semantic Web is characterized as a web of interlinked datasets, and thus querying the web can be seen as dataset integration on the web. This dataset integration must also be transparent to the data consumer, as if she were querying the whole web. To decide which datasets should be selected and integrated for...
A RADAR for information reconciliation in Question Answering systems over Linked Data
In recent years, more and more structured data has been published on the Web, and the need to support typical Web users in accessing this body of information has become of crucial importance. Question Answering systems over Linked Data try to address this need by allowing users to query Linked Data using natural language. These systems may query at the same time different heterogeneous interlinked dat...
Querying Tourism Information Systems in Natural Language
With the increasing amount of information available on the Internet, one of the most challenging tasks is to provide search interfaces that are easy to use without having to learn a specific syntax and that give users a means of finding what they really want. In this paper we present a query interface for an information system in the tourism domain exploiting the intuitiveness of natural language. Fur...
FREyA: An Interactive Way of Querying Linked Data Using Natural Language
Natural Language Interfaces are increasingly relevant for information systems fronting rich structured data stores such as RDF and OWL repositories, mainly because they are perceived as intuitive for humans. In previous work, we developed FREyA, an interactive Natural Language Interface for querying ontologies. It uses syntactic parsing in combination with the ontology-based lookup...